\(\color{#c64329}{\text{WELCOME}}\)

In this page, we provide a sequence of tutorials on how to fit Multi-output gaussian process (MOGP) for Multi-population longevity modeling. Important datasets and R scripts are provided for the users to follow. For further detail explanation of the methods, please visit our paper posted on the arXiv.

The structure of the tutorials is grouped into 3 parts:

\(\color{#c64329}{\text{Data}}\)

For the entire tutorials, we work with mortality data from the Human Mortality Database which provides the aggregated mortality statistics at the national levels for more than 40 developed countries across the globe.

The HMD applies the same consistent set of procedures on each population and presently focuses on developed economies where death registrations and census data are available and reliable. For our analysis we rely on one-year age groups, concentrating on Ages 50–84 (retirement ages most relevant for predictive actuarial analysis) for both genders and calendar Years 1990–2016. The dataset is organized as a large table. The \(n\)th observation for the \(l\)th country contains (i) Age and Year as a pair of independent variables, \((x_{ag}^n,x_{yr}^n)\), and (ii) the logarithm of the observed mortality rate, \[\begin{equation} y^n = \log\bigg[\frac{\text{Death counts at $(x_{ag}^n,x_{yr}^n)$}}{\text{Exposed-to-risk counts at $(x_{ag}^n,x_{yr}^n)$}}\bigg]=\log\bigg[\frac{D^n}{E^n}\bigg]. \end{equation}\] We denote by \(\mathcal{D}_l =\{(x^n,y^n)\}_{n=1}^N\) the dataset for the \(l\)th country.

\(\color{#c64329}{\text{Important notations}}\)

A Gaussian process (GP) is an infinite collection of random variables, any finite number of which follows a multivariate normal (MVN) distribution. As such, a GP \(f \sim GP( m, C)\) is characterized by its mean function \(m(x)\) and its covariance structure \(C(x,x')\). This means that for any vector \(\mathbf{x}=(x^1,\ldots,x^n)\) of \(n\) inputs: \[f(x^1),\ldots,f(x^n) \sim \mathcal{N}\big(\mathbf{m(x)},\mathbf{C(x,x)}\big)\] where \(\mathbf{m(x)}=\mathbb{E}[f(\mathbf{x})]\) is the mean vector of size \(n\) and \(\mathbf{C(x,x)}\) is the \(n\) by \(n\) covariance matrix, \(C(x,x') := \mathbb{E}[(f({x})-{m(x)})(f({x'})-{m(x')})]\).

In a GP regression setup, the latent \(f\) links the observations or output vector \(\mathbf{y}=(y^1,\ldots,y^n)\) to the input vector \(\mathbf{x}\) via: \[y^i = f(x^i) + \epsilon^i,\] where \(\epsilon^i\) is the error term to reflect that we observe only a noisy sample of \(f(x^i)\). In our context, \(x^i\) are the individual cells in a mortality table (so indexed by Age, Year, etc.), \(y^i\) are observed raw log mortality rates, and \(f(x^i)\) is the true mortality rate that would materialize in the absence of any random shocks. We assume that observation noise is Gaussian: \(\epsilon^i \sim \mathcal{N}(0,\sigma^2)\) or \(\epsilon=(\epsilon^1,...,\epsilon^n) \sim \mathcal{N}(\mathbf{0},\mathbf{\Sigma}=\text{diag}(\sigma^2))\). It follows that \(\text{Cov}(y^i,y^j) = \text{Cov}(f(x^i),f(x^j)) + \sigma^2\delta(x^i,x^j)\) and therefore \(\mathbf{y} \sim \mathcal{N}( \mathbf{m(x)}, \mathbf{C(x,x) + \Sigma})\) where \(\delta(x^i,x^j)\) is the Kronecker delta.

  Multi-output Gaussian Processes:

Let \(L\) be the number of different populations considered. To jointly model the \(L\) different outputs, \(\{f_l\}_{1\leq l \leq L}\) we correlate them using the framework of multi-output Gaussian Processes (MOGP) which was introduced in geostatistics under the name of multivariate kriging or co-kriging. The aim of co-kriging is to estimate the under-sampled variables using spatial correlation with other sampled variables.

The vector-valued latent response over the Age-Year input space is defined as: \[\mathbf{f(x)}=(f_1(x^1),\ldots,f_1(x^N),\ldots,f_L(x^1),\ldots,f_L(x^N))=(f_1(\mathbf{x}),\ldots,f_L(\mathbf{x}))\] where the functions \(\{f_l\}_{l=1}^L\) are the log-mortality surface for the corresponding population \(l\). Similar to single-output GP (SOGP), MOGP assumes that the vector-valued \(\mathbf{f}\) follows a GP: \[\mathbf{f} \sim GP(\mathbf{m,C})\] where \(\mathbf{m} \in \mathbb{R}^{LN \times 1}\) is the mean vector whose elements \(\{m_l(\mathbf{x})\}_{l=1}^L\) are the mean functions of each output, and \(\mathbf{C} \in \mathbb{R}^{LN \times LN}\) is the fused covariance matrix.